Yuchen Xie

MONA: Muon Optimizer with Nesterov Acceleration for Scalable Language Model Training

May 26, 2026
Via arXiv

OScaR: The Occam's Razor for Extreme KV Cache Quantization in LLMs and Beyond

May 19, 2026
Via arXiv

FG$^2$-GDN: Enhancing Long-Context Gated Delta Networks with Doubly Fine-Grained Control

Apr 21, 2026
Via arXiv

SparseBalance: Load-Balanced Long Context Training with Dynamic Sparse Attention

Apr 15, 2026
Via arXiv

Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation

Apr 11, 2026
Via arXiv

AsyncTLS: Efficient Generative LLM Inference with Asynchronous Two-level Sparse Attention

Apr 09, 2026
Via arXiv

LongCat-Next: Lexicalizing Modalities as Discrete Tokens

Mar 29, 2026
Via arXiv

A Geometry-Adaptive Deep Variational Framework for Phase Discovery in the Landau-Brazovskii Model

Mar 05, 2026
Via arXiv

SnapMLA: Efficient Long-Context MLA Decoding via Hardware-Aware FP8 Quantized Pipelining

Feb 12, 2026
Via arXiv

Scaling Embeddings Outperforms Scaling Experts in Language Models

Jan 29, 2026
Via arXiv